Abstract: Empirical natural language processing (NLP) systems in application domains (e.g., healthcare, finance, education) involve interoperation among multiple components, ranging from data ingestion and human annotation to text retrieval, analysis, generation, and visualization. We establish a unified open-source framework to support fast development of such sophisticated NLP workflows in a composable manner. The framework introduces a uniform data representation to encode the heterogeneous results of a wide range of NLP tasks. It offers a large repository of processors for NLP tasks, visualization, and annotation, which can be easily assembled with full interoperability under the unified representation. The highly extensible framework allows plugging in custom processors from external off-the-shelf NLP and deep learning libraries. The whole framework is delivered through two modularized yet integrable open-source projects, namely Forte (for workflow infrastructure and NLP function processors) and Stave (for user interaction, visualization, and annotation).
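To make the composability concrete, here is a minimal, hypothetical sketch of the reader-plus-processors pipeline pattern the abstract describes; the class and method names (\texttt{Pipeline}, \texttt{add}, \texttt{run}, and the commented components) are illustrative assumptions, not Forte's documented API.

\begin{verbatim}
# Hypothetical sketch of a composable NLP pipeline: a reader ingests
# raw data into a uniform representation ("pack"), and each processor
# reads from and writes to that shared representation in turn.
class Pipeline:
    def __init__(self, reader):
        self.reader = reader       # data-ingestion component
        self.processors = []       # ordered NLP processors

    def add(self, processor):
        self.processors.append(processor)  # compose one more task
        return self                        # enable chaining

    def run(self, source):
        for pack in self.reader.read(source):  # uniform data packs
            for proc in self.processors:
                proc.process(pack)  # each step enriches the pack
            yield pack

# Usage (hypothetical components):
#   packs = Pipeline(PlainTextReader()).add(Segmenter()) \
#               .add(NERTagger()).run("corpus/")
\end{verbatim}

Because every processor consumes and produces the same uniform representation, components can be reordered or swapped without glue code, which is the interoperability property the abstract emphasizes.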
Abstract: In this paper, we propose Latent Relation Language Models (LRLMs), a class of language models that parameterizes the joint distribution over the words in a document and the entities that occur therein via knowledge graph relations. This model has a number of attractive properties: it not only improves language modeling performance, but can also annotate entity spans in a given text with their posterior probabilities under knowledge graph relations. Experiments demonstrate empirical improvements over both a word-based baseline language model and a previous approach that incorporates knowledge graph information. Qualitative analysis further demonstrates the proposed model's ability to learn to predict appropriate relations in context.
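As a sketch of the parameterization the abstract describes (the notation here, with latent segmentation $z$, span boundaries $s_i, e_i$, and knowledge graph $G$, is ours and not necessarily the paper's exact formulation): a latent variable $z$ segments the text $x$ into spans, each generated either word by word or as an entity surface form through a knowledge-graph relation, and the model marginalizes over all segmentations:

\[
p(x \mid G) \;=\; \sum_{z \in \mathcal{Z}(x)} \prod_{i=1}^{|z|} p\bigl(z_i \mid x_{<s_i}, G\bigr)\, p\bigl(x_{s_i:e_i} \mid z_i, x_{<s_i}, G\bigr),
\]

where $z_i$ selects the source of the $i$-th span (plain words or a relation). The posterior $p(z \mid x, G)$ then supplies the entity-span annotations mentioned above.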
Abstract: We introduce a novel architecture for dependency parsing: \emph{stack-pointer networks} (\textbf{\textsc{StackPtr}}). Combining pointer networks~\citep{vinyals2015pointer} with an internal stack, the proposed model first reads and encodes the whole sentence, then builds the dependency tree top-down (from root to leaves) in a depth-first fashion. The stack tracks the status of the depth-first search, and at each step the pointer network selects one child for the word at the top of the stack. The \textsc{StackPtr} parser benefits from information about the whole sentence and all previously derived subtree structures, and removes the left-to-right restriction of classical transition-based parsers. Yet, as in other transition-based parsers, the number of steps required to build any parse tree (including non-projective ones) is linear in the sentence length, yielding an efficient decoding algorithm with $O(n^2)$ time complexity. We evaluate our model on 29 treebanks spanning 20 languages and different dependency annotation schemas, and achieve state-of-the-art performance on 21 of them.
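The decoding loop the abstract describes can be sketched as follows; this is a schematic rendering, where \texttt{point\_to\_child} stands in for the trained pointer network and is a hypothetical callable, not the authors' code.

\begin{verbatim}
# Schematic top-down, depth-first decoding with a stack; the pointer
# network (point_to_child) is assumed, not implemented here.
ROOT = 0  # artificial root at position 0; words occupy 1..n

def decode(n, point_to_child):
    head = [None] * (n + 1)  # head[i] = parent chosen for word i
    stack = [ROOT]           # tracks the depth-first search state
    while stack:
        top = stack[-1]
        child = point_to_child(top, stack, head)  # one pointer step
        if child is None:        # `top` has no remaining children
            stack.pop()          # backtrack
        else:
            head[child] = top    # new arc: top -> child
            stack.append(child)  # descend into the new subtree
    return head[1:]
\end{verbatim}

Each word is pushed and popped exactly once, so the number of steps is linear in the sentence length; with an $O(n)$ pointer evaluation per step, decoding costs $O(n^2)$ overall, matching the complexity quoted above.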